On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition

Authors

  • Marco Mondelli
  • Andrea Montanari
Abstract

We establish connections between the problem of learning a two-layers neural network with good generalization error and tensor decomposition. We consider a model with input x ∈ R^d, r hidden units with weights {w_i}_{1≤i≤r} ⊂ R^d and output y ∈ R, i.e., y = ∑_{i=1}^{r} σ(〈x, w_i〉), where 〈·,·〉 denotes the scalar product and σ the activation function. First, we show that, if we cannot learn the weights {w_i}_{1≤i≤r} accurately, then the neural network does not generalize well. More specifically, the generalization error is close to that of a trivial predictor with access only to the norm of the input. This result holds for any activation function, and it requires that the weights are roughly isotropic and the input distribution is Gaussian, a typical assumption in the theoretical literature. Then, we show that the problem of learning the weights {w_i}_{1≤i≤r} is at least as hard as the problem of tensor decomposition. This result holds for any input distribution and assumes that the activation function is a polynomial whose degree is related to the order of the tensor to be decomposed. Putting everything together, we prove that learning a two-layers neural network that generalizes well is at least as hard as tensor decomposition. It has been observed that neural network models with more parameters than training samples often generalize well, even when the problem is highly underdetermined. This means that the learning algorithm does not estimate the weights accurately, yet achieves good generalization error. This paper shows that such a phenomenon cannot occur when the input distribution is Gaussian and the weights are roughly isotropic. We also provide numerical evidence supporting our theoretical findings.
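To make the tensor-decomposition connection concrete, here is a minimal numerical sketch; it is our own illustration under simplifying assumptions, not the paper's exact construction. We take standard Gaussian inputs, unit-norm weights, and the degree-3 Hermite polynomial He_3(t) = t³ − 3t as a degree-3 polynomial activation. A standard Gaussian-moment identity then gives E[y · x⊗x⊗x] = 3! ∑_{i=1}^{r} w_i^{⊗3}, so estimating this cross-moment reduces learning the weights to decomposing a rank-r tensor. The dimensions d, r, n and the activation choice below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 10, 3, 100_000  # hypothetical input dimension, hidden units, samples

# Unit-norm weight vectors (random directions are roughly isotropic).
W = rng.standard_normal((r, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# Standard Gaussian inputs, as assumed in the paper's first result.
X = rng.standard_normal((n, d))

def sigma(t):
    """Degree-3 polynomial activation: Hermite polynomial He_3(t) = t^3 - 3t."""
    return t**3 - 3.0 * t

# Network output y = sum_{i=1}^r sigma(<x, w_i>).
Y = sigma(X @ W.T).sum(axis=1)

# Empirical third-order cross-moment T_hat[a,b,c] ~ E[y * x_a * x_b * x_c].
T_hat = np.einsum('j,ja,jb,jc->abc', Y, X, X, X) / n

# For Gaussian x and |w_i| = 1, E[He_3(<w_i, x>) x(x)x(x)x] = 3! * w_i^{(x)3},
# so the population moment is the rank-r tensor 6 * sum_i w_i(x)w_i(x)w_i.
T_pop = 6.0 * np.einsum('ia,ib,ic->abc', W, W, W)

rel_err = np.linalg.norm(T_hat - T_pop) / np.linalg.norm(T_pop)
print(f"relative error of the empirical moment tensor: {rel_err:.3f}")
```

Decomposing the estimated tensor (e.g., via a CP decomposition) would then recover the weight directions up to permutation, which is precisely the tensor-decomposition problem the reduction in the abstract points to.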


Similar Articles

On the convergence speed of artificial neural networks in the solving of linear systems

Artificial neural networks have advantages such as learning, adaptation, fault-tolerance, parallelism and generalization. This paper is a scrutiny of the application of diverse learning methods to the convergence speed of neural networks. To this aim, we first introduce a perceptron method based on artificial neural networks which has been applied for solving a non-singula...

Effect of sound classification by neural networks in the recognition of human hearing

In this paper, we focus on two basic issues: (a) the classification of sound by neural networks based on frequency and sound-intensity parameters; (b) evaluating the health of different human ears as compared to those of a healthy person. Sound classification by a specific feed-forward neural network with two inputs, frequency and sound intensity, and two hidden layers is proposed. This process...

Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns

The purpose of this study is to analyze the performance of the backpropagation algorithm with changing training patterns and a second momentum term in feed-forward neural networks. This analysis is conducted on 250 different words of three small letters from the English alphabet. These words are presented to two vertical segmentation programs, designed in MATLAB, based on portions (1...

An Analysis of the Connections Between Layers of Deep Neural Networks

We present an analysis of different techniques for selecting the connections between layers of deep neural networks. Traditional deep neural networks use random connection tables between layers to keep the number of connections small and to tune to different image features. This kind of connection performs adequately in supervised deep networks because their values are refined during training. ...

Growing Artificial Neural Networks Based on ...

With this paper we propose a learning architecture for growing complex artificial neural networks. The complexity of the growing network is adapted automatically according to the complexity of the task. The algorithm generates a feed-forward network bottom-up by cyclically inserting cascaded hidden layers. Inputs of a hidden-layer unit are locally restricted with respect to the input space by...


Journal:
  • CoRR

Volume: abs/1802.07301

Pages: –

Publication date: 2018